Reinforcement Learning in POMDPs: Instance-Based State Identification vs. Fixed Memory Representations

Author

Joshua J. Estelle

Abstract

This paper explores an instance-based state identification technique called nearest sequence memory presented by McCallum (1994). The algorithm uses a basic k-nearest neighbor approach to solving the problem of hidden state in a reinforcement learning problem. We compare this algorithm with a more commonly used fixed memory representation, history windows.
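The contrast between the two memory representations can be illustrated with a minimal sketch. It is not taken from the paper: the function names (`window_key`, `match_length`, `nearest_sequences`, `mean_q`) and the toy history are hypothetical, and the neighbor metric shown (the number of consecutive matching steps counted backwards through the history) is an assumption about how nearest sequence memory compares past experience. Per-action voting and Q-value updates are omitted.

```python
# Illustrative sketch only; names and data are hypothetical, not from the paper.
# A history is a list of (action, observation) pairs, one per time step.

def window_key(history, t, n):
    """Fixed memory: treat the last n steps ending at t as the agent's state."""
    return tuple(history[max(0, t - n + 1): t + 1])

def match_length(history, i, j):
    """Count how many consecutive steps agree when walking backwards from i and j."""
    n = 0
    while i - n >= 0 and j - n >= 0 and history[i - n] == history[j - n]:
        n += 1
    return n

def nearest_sequences(history, t, k):
    """Return the k earlier steps whose preceding history best matches step t."""
    scored = [(match_length(history, i, t), i) for i in range(t)]
    # Longest match first; ties go to the more recent step.
    scored.sort(key=lambda pair: (-pair[0], -pair[1]))
    return [i for _, i in scored[:k]]

def mean_q(q_values, neighbors):
    """Estimate a value for the current step by averaging the neighbors' values."""
    return sum(q_values[i] for i in neighbors) / len(neighbors)

history = [("L", "a"), ("R", "b"), ("L", "a"), ("R", "b"), ("L", "a")]
q_values = [0.0, 1.0, 4.0, 1.0, 0.0]  # made-up values stored with each step

neighbors = nearest_sequences(history, t=4, k=2)
estimate = mean_q(q_values, neighbors)
```

A history window always looks back exactly `n` steps, so two situations that differ only further back are merged. The instance-based variant instead ranks past steps by how far back they keep matching, so the effective memory length varies with the data.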

Similar resources

Hidden state and reinforcement learning with instance-based state identification

Real robots with real sensors are not omniscient. When a robot's next course of action depends on information that is hidden from the sensors because of problems such as occlusion, restricted range, bounded field of view and limited attention, we say the robot suffers from the hidden state problem. State identification techniques use history information to uncover hidden state. Some previous ap...

Free-energy-based reinforcement learning in a partially observable environment

Free-energy-based reinforcement learning (FERL) can handle Markov decision processes (MDPs) with high-dimensional state spaces by approximating the state-action value function with the negative equilibrium free energy of a restricted Boltzmann machine (RBM). In this study, we extend the FERL framework to handle partially observable MDPs (POMDPs) by incorporating a recurrent neural network that ...

Reinforcement Learning in POMDPs with Memoryless Options and Option-Observation Initiation Sets

Many real-world reinforcement learning problems have a hierarchical nature, and often exhibit some degree of partial observability. While hierarchy and partial observability are usually tackled separately (for instance by combining recurrent neural networks and options), we show that addressing both problems simultaneously is simpler and more efficient in many cases. More specifically, we make ...

Metric learning for reinforcement learning agents

A key component of any reinforcement learning algorithm is the underlying representation used by the agent. While reinforcement learning (RL) agents have typically relied on hand-coded state representations, there has been a growing interest in learning this representation. While inputs to an agent are typically fixed (i.e., state variables represent sensors on a robot), it is desirable to auto...

Reinforcement Learning by Policy Search

One objective of artificial intelligence is to model the behavior of an intelligent agent interacting with its environment. The environment's transformations can be modeled as a Markov chain, whose state is partially observable to the agent and affected by its actions; such processes are known as partially observable Markov decision processes (POMDPs). While the environment's dynamics are assumed...

Publication date: 2003
